Geographic utilization of artificial intelligence in real-time for disease identification and alert notification

ABSTRACT

Systems and methods for generating a diagnosis are provided. In some aspects, a computing device receives medical information for a patient, wherein each medical information item in the medical information comprises a date, a source, and a medical state. The computing device constructs, in a memory of the computing device, a diagnosis tree for the patient, wherein the diagnosis tree comprises a patient node, the patient node having first children nodes corresponding to the dates or the sources, and the first children nodes having second children nodes corresponding to the medical states. The computing device generates a diagnosis for the patient using the constructed diagnosis tree.

RELATED APPLICATIONS

The present patent document is a continuation of application Ser. No. 15/387,506, filed Dec. 21, 2016, which is a continuation of application Ser. No. 14/768,304, filed Aug. 17, 2015, now U.S. Pat. No. 9,594,878, and is a National Stage Entry of PCT/US2014/027139, filed Mar. 14, 2014, which claims the benefit of the filing date under 35 U.S.C. § 119(e) of Provisional U.S. Patent Application Ser. No. 61/794,393, filed Mar. 15, 2013 and entitled “GEOGRAPHIC UTILIZATION OF ARTIFICIAL INTELLIGENCE IN REAL-TIME FOR DISEASE IDENTIFICATION AND ALERT NOTIFICATION,” the entire contents of each of which are incorporated by reference.

FEDERALLY SPONSORED RESEARCH OR DEVELOPMENT

This invention was made with government support under contract numbers W81XWH-05-1-0614, W81XWH-06-1-0785, W81XWH-09-1-0662, and W81XWH-11-1-0711, awarded by the United States Army Medical Research Acquisition Activity. The Government has certain rights to this invention.

BACKGROUND

The subject technology is generally directed to computer-implemented disease identification and detection of outbreaks of diseases.

A medical professional responsible for diagnosing a patient may be hesitant to diagnose the patient with an uncommon condition, for example, due to his/her own unfamiliarity with the condition or due to his/her own inability to believe that the condition is occurring. Also, the medical professional may be unfamiliar with other recent diagnoses in his/her geographic area. As a result, an outbreak of a rare medical condition in a geographic area (e.g., polio in the Chicago metropolitan area) may be difficult to detect or may not be detected until the condition becomes very widespread. As the foregoing illustrates, a new approach for disease identification and detection of outbreaks of diseases may be desirable.

BRIEF SUMMARY

Methods, computer-readable media, and systems for generating a diagnosis are provided. In some aspects, a computing device receives medical information for a patient, wherein each medical information item in the medical information comprises a date, a source, and a medical state. The computing device constructs, in a memory of the computing device, a diagnosis tree for the patient, wherein the diagnosis tree comprises a patient node, the patient node having first children nodes corresponding to the dates or the sources, and the first children nodes having second children nodes corresponding to the medical states. The computing device generates a diagnosis for the patient using the constructed diagnosis tree.

Methods, computer-readable media, and systems for detecting an outbreak of a medical condition are provided. In some aspects, a computing device receives reports of patients having a set of medical states, each report being associated with a same geographic area. The computing device determines, based on data stored in a medical data repository, that the set of medical states is associated with a specified condition. The computing device determines an outbreak of the specified condition in the geographic area based on the reports of the patients having the set of medical states. The computing device provides an indication of the outbreak of the specified condition in the geographic area.

It is understood that other configurations of the subject technology will become readily apparent from the following detailed description, where various configurations of the subject technology are shown and described by way of illustration. As will be realized, the subject technology is capable of other and different configurations and its several details are capable of modification in various other respects, all without departing from the scope of the subject technology. Accordingly, the drawings and detailed description are to be regarded as illustrative in nature and not as restrictive.

BRIEF DESCRIPTION OF THE DRAWINGS

Features of the subject technology are set forth in the appended claims. However, for purposes of explanation, several aspects of the disclosed subject matter are set forth in the following figures.

FIG. 1 illustrates an example system in which examples of the subject technology may be implemented.

FIG. 2 illustrates an example profile development methodology based on literature.

FIGS. 3A-3B illustrate an example profile development methodology based on data.

FIG. 4 illustrates an example synthetic patient generator logic.

FIG. 5 illustrates an example computing device for generating a diagnosis for a patient.

FIG. 6 illustrates an example process for generating a diagnosis for a patient.

FIG. 7 illustrates an example system for determining an outbreak of a medical condition in a geographic area.

FIG. 8 illustrates an example process for determining an outbreak of a medical condition in a geographic area.

FIG. 9 conceptually illustrates an example electronic system with which some implementations of the subject technology are implemented.

DETAILED DESCRIPTION OF THE DRAWINGS AND THE PRESENTLY PREFERRED EMBODIMENTS

The detailed description set forth below is intended as a description of various configurations of the subject technology and is not intended to represent the only configurations in which the subject technology may be practiced. The appended drawings are incorporated herein and constitute a part of the detailed description. The detailed description includes specific details for the purpose of providing a thorough understanding of the subject technology. However, it will be clear and apparent that the subject technology is not limited to the specific details set forth herein and may be practiced without these specific details. In some instances, certain structures and components are shown in block diagram form in order to avoid obscuring the concepts of the subject technology.

Real-time disease surveillance may be important for early detection of emerging infectious and non-infectious diseases, naturally occurring illnesses or the covert release of a biological threat agent. Some trend analysis systems collect data and categorize discrete clinical variables into syndrome cohorts. In some cases, analysis occurs days to weeks after the actual collection of the data. This inherent delay in analysis may, in some cases, preclude an efficient response to an emerging health threat. The subject technology may, in some cases, implement predictive analytics for operational efficiencies. For example, some of the techniques described herein may be used, among other things, to assimilate relevant data from multiple clinical sources to determine if a defined set of criteria are met for a patient's condition. The data may be used to determine a probability that the patient will be admitted to a hospital, develop a certain other condition, require a certain treatment, etc. Computing device(s) implementing the subject technology may also recommend certain treatments (e.g., to prescribe a first medication or not to prescribe a second medication) based on the assimilated data.

The subject technology can be implemented as a computer-implemented method, a computer system, or a non-transitory computer-readable medium including instructions (e.g., software code). The subject technology can be implemented on one or more computing devices. A computing device can include a server, a data repository, a database, a laptop computer, a desktop computer, a netbook, a mobile phone, a tablet computer, a personal digital music player, a personal digital assistant (PDA), etc.

The subject technology, in some implementations, relates to a real-time, scalable, automated, knowledge-based disease detection and diagnosis system. The subject technology, in some implementations, includes conducting real-time (e.g., immediate, within one minute, within ten minutes, within one hour, etc., or without any intentional delay) analysis of multiple pre-diagnostic parameters received from electronic medical records as they are routinely documented as part of a clinical encounter. Some implementations of the subject technology can be utilized at any location where a patient seeks treatment (e.g., emergency departments, doctors' offices, clinics, etc.). The data being analyzed in some implementations of the subject technology may include, among other things, chief complaints, history and physical examination notes, nursing and physician notes, other provider notes, radiology dictated reports or laboratory test results. The subject technology, according to some implementations, is able to analyze discrete variables from “check boxes” as well as “pull down” menus and has a robust natural language processor, based on the National Library of Medicine, to analyze free-text entries including comments of negation.

As used herein, the term “real-time” encompasses its plain and ordinary meaning including, but not limited to, occurring without any intentional delay after an entry of a command or a triggering occurrence. For example, a command that is executed in real-time after a mouse click can be executed without any intentional delay after the mouse click, for example, within one second, ten seconds, one minute, ten minutes, one hour, etc. of the mouse click.

Some implementations of the subject technology include analyzing module(s). The analyzing module(s) can, in some cases, have one or both of two types: 1) pre-defined common disease definitions, for example, as provided by an expert authority, locally, regionally, nationally or world-wide (e.g., influenza-like illness as defined by the Centers for Disease Control (CDC) or locally occurring diseases as defined by public health or emergency management experts), and 2) disease profiling for rarely occurring illnesses such as Anthrax or Smallpox. The analyzing module(s) can include artificial intelligence based module(s).

In some examples, the pre-defined definition module(s) can use a case definition for a certain medical condition as described, for example, by a higher authority content expert. One example of the certain medical condition is influenza-like illness (ILI), which is defined by the Centers for Disease Control (CDC). This information is programmed into computing device(s) implementing the subject technology and, as the computing device(s) receive data from the various modules of the patient electronic health record (EHR), the received data can be analyzed and converted into clinically useful information. The computing device(s) calculate a probability (expressed, for example, as a percent) of the patient having a given condition (e.g., as programmed into the computing device(s)) and provide this information to the user. If the patient has a high probability for a given disease entity, the user can initiate an appropriate intervention. The intervention may be, for example, an infection control process such as wearing a mask or even a clinical intervention such as antibiotic dispensation or vaccination.

As cumulative data is acquired from multiple patients, the computing device(s) can be programmed to provide a 7-day (or any other time period, including a fixed time period or a variable time period) rolling statistical averaging analysis to create thresholds. During periods of no disease activity, these thresholds can become the “normal” level of disease activity in the community. The community can be defined based on age, gender, physical characteristic(s), geographic location, etc. For example, the community can include American adults ages 18-65 or women over age 70 residing within 50 miles of Chicago, Ill. When disease activity occurs, this threshold may be breached, providing an indication (e.g., a first indication) of an outbreak. Early recognition of a specific disease outbreak can lead to early public health interventions on a widespread scale. The automaticity and real-time analysis that the subject technology provides translates into disease recognition far earlier than some other approaches, which may rely on human intervention for data submission, analysis, recognition, disseminated reporting, and/or intervention.
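The rolling threshold described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the function names, the use of the mean plus a multiple of the standard deviation, and the multiplier `k` are all assumptions, since the text specifies only a rolling statistical averaging analysis over a period such as 7 days.

```python
from statistics import mean, stdev


def rolling_threshold(daily_counts, window=7, k=2.0):
    """Threshold from the most recent `window` daily case counts.

    Assumption: threshold = mean + k * sample standard deviation.
    The source text only says a rolling average is used to create thresholds.
    """
    recent = list(daily_counts)[-window:]
    if len(recent) < 2:
        raise ValueError("need at least two days of counts")
    return mean(recent) + k * stdev(recent)


def breaches_threshold(daily_counts, today_count, window=7, k=2.0):
    """True when today's count exceeds the rolling threshold."""
    return today_count > rolling_threshold(daily_counts, window, k)


# Hypothetical daily counts of matching cases over a quiet week.
baseline = [4, 5, 6, 5, 4, 6, 5]
print(breaches_threshold(baseline, 6))   # within the normal range
print(breaches_threshold(baseline, 12))  # possible first indication of an outbreak
```

A separate baseline would be kept for each community definition (for example, by age range or geographic radius), so that a breach is judged against that community's own normal level.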

In some implementations, the information can be relayed to pre-determined parties through one or more of several different approaches: pager; automated reports and graphs for any time period(s) requested (e.g., hourly, daily, weekly, monthly, etc.); a user interface through a webpage, a mobile phone application, a tablet computer application, etc. The purpose of the various communication methods is to assure appropriate communication for the various purposes for which the subject technology may be used. The various purposes may include one or more of reporting (e.g., mandatory reporting), clinical intervention, or public health oversight of community-based disease activity. The subject technology, in some implementations, is also able to geographically map the location where specific case(s) have emerged. The geographically mapped information or geographic location information could help public health or emergency response officials quickly contain an outbreak.

In some examples, the subject technology can be used to provide the details set forth above for other clinical situations that may arise in a community or even from specific mass gatherings, for example, febrile rash illness, localized cutaneous lesions, acute febrile respiratory illness, gastrointestinal illness, botulism-like illness, hemorrhagic illness, unexplained deaths or severe illness and poison or toxin exposure, or environmental heat illness. In these cases, the subject technology can follow a case definition if one is provided. If the case definition is not provided, the subject technology can include analyzing various International Statistical Classification of Diseases and Related Health Problems (ICD, for example, ICD-9) codes for clinical diagnosis. In some examples, the subject technology does not rely solely on final diagnosis and ICD-9 codes. Instead, the subject technology can include analyzing entire or partial contents of the medical record with the understanding that a clinician may provide an inaccurate ICD-9 code which could lead to a misdiagnosis if the inaccurate ICD-9 code were the sole factor in determining the presence of a disease entity. According to some examples, the unbiased, automated, artificial-intelligence based data analysis of the subject technology can eliminate the potential of a clinician drawing the wrong conclusion from the data provided during a clinical encounter. In addition, the subject technology can account for all or some aspects of the encounter which may not be readily apparent to an individual health care provider, as patients sometimes may relay their symptoms piecemeal.

In contrast to the pre-defined illness programming, disease profiling for rarely occurring illnesses can be a more elaborate process that takes disease surveillance for rarely occurring illnesses beyond simple trend analysis. Some of the subject technology can be used to elevate the artificial intelligence technique(s) described above for use by a clinician prepared to recognize unfamiliar disease(s). The subject technology, in some implementations, can be used to recognize the occurrence of Category A Biological Threat Agents (BTAs): Anthrax, Botulism, Hemorrhagic Fever, Pneumonic Plague, Smallpox and Tularemia. The subject technology, in some implementations, can be used to recognize the occurrence of Category B BTAs: Pandemic Influenza, SARS, Q fever, Ricin, Brucellosis, food safety threats, West Nile virus, water safety threats, Typhus and Glanders. The subject technology, in some implementations, can be used to recognize chemical agents, for example, Chemical Threat Agents: nerve gas, blistering agents, choking agents and asphyxiants. The subject technology, in some implementations, can be used to recognize radiological exposures.

Because rarely occurring illnesses present infrequently, their detection can benefit from a different artificial intelligence knowledge base, as described herein. In these cases, the literature can be manually searched, by clinical expert(s), for the historic cases describing the actual disease process. Information can be manually garnered, by clinical expert(s), from those literature searches and tabulated in cumulative form. Weighted factors of strength can be applied to signs, symptoms, and diagnostic studies as a disease profile or “fingerprint” is formed. This information can be programmed into computing device(s) implementing examples of the subject technology and the profile can be run against actual patient encounters in which the disease was not present in order to refine the profile by studying the cause of false positive alerts. Once refined, the subject technology can be used to run the profile against known cases of the disease, for example, from the literature that were not used in the original profile development. If too few positive cases are present in the literature, a synthetic patient can be generated. Again, results of false negatives can be studied to further refine the final disease “fingerprint” formed as described herein.
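One way to picture a weighted disease “fingerprint” is as a mapping from findings to weights, scored against a patient encounter. The sketch below is illustrative only: the scoring rule (matched weight divided by total weight), the function name, and the example weights are assumptions and carry no clinical meaning.

```python
def profile_score(profile, findings):
    """Percent of a profile's total weight matched by the observed findings.

    profile:  dict mapping a sign, symptom, or study result to its weight
    findings: set of items observed in the patient encounter
    Assumption: a simple weighted-fraction score; the source text does not
    specify how the weighted factors are combined.
    """
    total = sum(profile.values())
    if total <= 0:
        raise ValueError("profile must have positive total weight")
    matched = sum(weight for item, weight in profile.items() if item in findings)
    return 100.0 * matched / total


# Hypothetical profile; weights are placeholders, not clinical values.
example_profile = {
    "fever": 2.0,
    "cough": 1.0,
    "dyspnea": 2.0,
    "widened mediastinum": 5.0,
}

print(profile_score(example_profile, {"fever", "cough"}))
print(profile_score(example_profile, {"fever", "dyspnea", "widened mediastinum"}))
```

Running such a profile against encounters known not to involve the disease gives the false positive cases that the refinement step examines, and adjusting the weights is one plausible way to act on them.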

The subject technology, in some implementations, can analyze all or some data sources for all or some patients to determine the percent probability of the presence of a specific disease. For these rarely occurring illnesses, community based thresholds, in some cases, may not exist. Thus, a positive result above a predetermined percent probability can be considered significant. Once significance is established, a warning, for example, in the form of a webpage, a printed page, an electronic (e.g., email) message or a message to a mobile device (e.g., a short messaging service (SMS) message or an alert pushed to the mobile device) can be sent to designated individuals. The subject technology, in some cases, has a user interface that can also be used to monitor real-time analysis of patient populations for rarely occurring disease attributes.

The subject technology, in some implementations, relates to a powerful and highly flexible artificial intelligence framework that rapidly compares emergency department (ED) patient symptoms to a library of disease profiles. The profiles can be useful for at least two reasons. First, these profiles can allow the encapsulation of expert medical knowledge and case summaries in a format that may be useful for rapid categorization tasks. Second, these profiles allow knowledge to be applied at facilities that do not have access to the subject matter experts from which the knowledge was extracted.

In addition to the artificial intelligence-based Inference Engine component and the disease profiles, the subject technology, in some examples, can include a Pre-Processor, an Alert Notification System, a Human Interaction System, an Automated Knowledge Acquisition System, a set of Response Packages and a set of relational databases managed by a Database Management System (DBMS). FIG. 1 provides an overview of a system in which some examples of the subject technology can be implemented.

The pre-processor, in some examples, includes three major parts. The first part is a Transmission Control Protocol (TCP) server that listens on one or more specific ports for incoming messages (e.g., Health Level 7 (HL7) messages). The second part is a message processor. The message processor converts all of the information from the incoming messages into an internal format and manages any links between different messages (e.g., multiple messages for the same patient, a lab order and its associated results, a very large message broken into multiple smaller messages, etc.). The message processor also removes all individually-identifying information from the incoming messages in order to maintain Health Insurance Portability and Accountability Act (HIPAA) compliance. Any pieces of information that could be used to identify a patient (e.g., Social Security number (SSN), name, medical record number (MRN)) are removed from the incoming data and stored in a separate area in the DBMS that is accessible only to specifically authorized users. Also, all patient address information is obfuscated by first converting the address to a latitude and longitude coordinate, and then rounding those coordinates to the nearest 100 meters (or other radius).
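The coordinate rounding step can be sketched as snapping a latitude and longitude pair to a grid of roughly 100-meter cells. This is a minimal sketch under stated assumptions: the function name is hypothetical, the conversion uses an approximate 111,320 meters per degree of latitude, and geocoding the street address into coordinates is assumed to have happened already.

```python
import math

METERS_PER_DEG_LAT = 111_320.0  # approximate length of one degree of latitude


def obfuscate_coordinates(lat, lon, cell_m=100.0):
    """Snap (lat, lon) in degrees to the nearest point on a ~cell_m grid.

    Assumption: a spherical-Earth approximation is accurate enough for
    obfuscation; this is not a geodetically exact projection.
    """
    lat_step = cell_m / METERS_PER_DEG_LAT
    snapped_lat = round(lat / lat_step) * lat_step

    # A degree of longitude shrinks with latitude; clamp to avoid dividing
    # by zero at the poles.
    scale = max(math.cos(math.radians(snapped_lat)), 1e-6)
    lon_step = cell_m / (METERS_PER_DEG_LAT * scale)
    snapped_lon = round(lon / lon_step) * lon_step
    return snapped_lat, snapped_lon


# Hypothetical coordinate in the Chicago area.
print(obfuscate_coordinates(41.878114, -87.629798))
```

Snapping to a grid means every address inside a cell maps to the same stored point, so the stored coordinate supports mapping and clustering without pinpointing a residence.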

One other part of the pre-processor, in some examples, is the Natural Language Processing (NLP) subsystem. The NLP subsystem mines all or some incoming free-text for words or phrases that are associated with one or more of the known disease profiles. As a result, the text becomes more useful to the inference engine and irrelevant information may be removed. HIPAA compliance is also maintained for free-text data, as it is for structured data.
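A very small sketch of phrase mining with negation handling is shown below. It is not the NLP subsystem described here: the negation cue list, the fixed look-back window, and the function name are assumptions chosen to keep the example short, and the window can wrongly cross sentence boundaries.

```python
import re

# Assumed negation cues; a production system would use a far richer lexicon.
NEGATION_CUES = ("no", "not", "denies", "without")


def extract_findings(text, terms, window=3):
    """Map each profile term found in `text` to True (affirmed) or False (negated).

    A term counts as negated when a cue appears within `window` tokens
    before it. Terms that never occur are omitted from the result.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    found = {}
    for term in terms:
        parts = term.lower().split()
        n = len(parts)
        for i in range(len(tokens) - n + 1):
            if tokens[i:i + n] != parts:
                continue
            preceding = tokens[max(0, i - window):i]
            negated = any(cue in preceding for cue in NEGATION_CUES)
            # A term is affirmed if any occurrence is not negated.
            found[term] = found.get(term, False) or not negated
    return found


note = "Patient reports fever and chest pain. Denies cough."
print(extract_findings(note, ["fever", "chest pain", "cough", "rash"]))
```

Only the affirmed terms would then be passed on as medical states, which is one way the text can become more useful to the inference engine.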

In some implementations, the pre-processor may be implemented with component parts different from those specified above.

In some examples, the alert notification system is triggered when the inference engine generates an alert. Alerts can be triggered either by an individual case having a high probability of a particular diagnosis, or by a group of similar cases (clustered in time and in space) having a probability above a separate, and possibly lower, BTA-specific threshold. The alert notification system broadcasts these alerts to the appropriate people and resends unanswered alerts periodically. The human interaction system can be a web-based user interface or another user interface. The interface supports monitoring and interactive exploration of stored data and provides a mechanism for users to respond to alerts generated by the system.

In some examples, the automated knowledge-acquisition system uses self-learning methods to improve the BTA and non-BTA detection capabilities described herein. This automated learning system can, in some examples, perform four major functions. First, the automated learning system determines the missing or new information that, if known, would be most effective in more accurately classifying “borderline” cases. The automated learning system obtains the information either by reviewing older archived data or through an efficient interaction with expert users. Second, the automated learning system examines the current models contained in the knowledge base and refines the structure and parameters of these models in order to optimize the accuracy and efficiency of the evaluation process. Third, the automated learning system reviews all of the stored data, both current and archived, in order to discover patterns that may suggest the occurrence of a disease that is currently unknown to the computing device(s) implementing the subject technology. Fourth, the automated learning system handles knowledge-base portability when the subject technology is used to transfer clinical knowledge between different medical facilities. In some examples, the automated learning system can have other functionality or can lack one or more of the functionalities described above.

The response packages managed by the subject technology, in some cases, contain information about each known disease or some of the known diseases in a human-readable format. The information can include differential diagnoses, clinical procedures and isolation protocols. The response packages can be made available through the user interface in the event of an alert being triggered.

The subject technology can be designed to perform under conditions of increased volume without significant impact to system performance. Scalability of some examples of the subject technology can be important for at least two primary reasons. First, the surveillance of populous areas can require systems to operate at disparate levels of patient volume. Second, in the event of an incident, the volume of patient transactions may exceed normal levels of operation. In some cases, the subject technology is able to adapt to these potential changes. In some cases, the subject technology can be implemented as a multi-threaded application with a robust TCP server and Database Management System (DBMS) to handle a wide variety of load levels.

To support heavy user connection levels, the user interface can be implemented using Adobe Systems' Flex technology or similar technologies. This technology allows each computer that connects to the computing device(s) implementing the subject technology to handle its own caching and rendering. Thus, the subject technology can provide a very rich cross-platform web-based application to each user without placing a heavy processing demand on the server. As a result, the subject technology can operate during times of highly concurrent access, for example, during a disease outbreak.

In some examples, clinical profiles can be developed for the Category A and Category B BTAs as well as for more common infectious diseases such as influenza, gastroenteritis and the common cold. The clinical profiles can be used to create the disease models and to determine the thresholds that the inference engine uses to determine whether or not new data generates an alert. The inference engine can use these disease models to classify patient visits according to the likelihood that each visit is the result of exposure to a known BTA.

The subject technology, in some implementations, effectively balances the dual challenges of early detection of individual non-infectious and infectious agents with simultaneous detection of unusual patterns of disease occurrence in a target population. The subject technology can be used, among other purposes, to assist clinicians, emergency management personnel and administrators in tracking, detecting and reporting emerging illnesses as well as rarely occurring diseases such as potential BTAs quickly and effectively, in order to improve responsiveness to and mitigation of the effects of a large-scale outbreak.

Real-time disease surveillance can be useful for early detection of emerging infectious diseases and/or the covert release of a biological threat agent.

In some implementations of the subject technology, computing device(s) are programmed to detect the spread of biological and infectious agents by analyzing symptoms as patients enter the emergency department (or another treatment facility). Some trend analysis systems look at data that is collected and analyzed in a batch and sent to a lab, sometimes up to two weeks after the patient is seen.

In some implementations of the subject technology, computing device(s) analyze the data in real-time, meaning the test results are entered into the system and analyzed without intentional delay, for example, within one minute, one hour, one day, or two days of the patient being seen. Using this technology, the computing device(s) can potentially identify an outbreak of influenza or even an Anthrax attack weeks in advance of some other systems, possibly saving valuable time in an emergency when even seconds matter.

The subject technology, in some examples, includes a real-time, scalable, extensible, automated, knowledge-based biological threat agent (BTA) detection and diagnosis system implemented on computing device(s). The subject technology, in some examples, conducts real-time analysis of multiple pre-diagnostic parameters from records already being collected within an emergency department (or other treatment facility), such as triage chief complaints, physician exam notes, and test orders and results.

The computing device(s), in some examples, can send alerts to physicians' pagers or mobile phones notifying the physicians of possible or confirmed cases of bioterrorism agents, for example, Anthrax, smallpox, or plague, when they are identified (e.g., within one minute of identification). The computing device(s) are able to map where those cases have appeared in the city, providing powerful pieces of information that could help physicians more quickly contain the outbreak.

Examples of the subject technology can be implemented with a Pre-Processor, an Inference Engine, an Alert Notification System, a Human Interaction System, a Memory Archiver and a set of relational databases managed by a Database Management System.

In some examples, the pre-processor receives messages (e.g., HL7 messages) sent to the computing device(s), removes any individually identifying information, and stores the HIPAA-compliant data in Experiential Memory. Each time new data is added to Experiential Memory, the inference engine sub-system determines whether or not this new information triggers a BTA alert. If the new data represents a confirmed case of a known BTA, then the inference engine can update the parameters of the associated model accordingly. BTA alerts can be triggered either by an individual case having a probability of diagnosis that is above a BTA-specific threshold, or by a group of similar cases (clustered in time or in space) that have a probability above a separate, and generally lower, BTA-specific threshold.
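The two alert triggers can be sketched as a single check run each time a new case is scored. The sketch below is an illustration under assumptions: the `Case` fields, the threshold values, the minimum cluster size, the 7-day window, and the 10 km radius are all hypothetical. It also requires a cluster to be close in both time and space, which is one reading of the text; an implementation could accept either kind of proximity.

```python
import math
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Case:
    when: datetime       # time of the encounter
    x_km: float          # obfuscated position, in km on a local grid
    y_km: float
    probability: float   # probability of the BTA diagnosis, 0..1


def alert_reason(prior_cases, new_case, case_threshold=0.9,
                 cluster_threshold=0.6, min_cluster=3,
                 window=timedelta(days=7), radius_km=10.0):
    """Return "individual", "cluster", or None for a newly scored case."""
    if new_case.probability >= case_threshold:
        return "individual"
    if new_case.probability < cluster_threshold:
        return None
    nearby = [
        c for c in prior_cases
        if c.probability >= cluster_threshold
        and abs(new_case.when - c.when) <= window
        and math.hypot(c.x_km - new_case.x_km, c.y_km - new_case.y_km) <= radius_km
    ]
    # Count the new case itself toward the cluster size.
    return "cluster" if len(nearby) + 1 >= min_cluster else None


now = datetime(2014, 3, 14)
history = [
    Case(now - timedelta(days=1), 0.0, 0.0, 0.70),
    Case(now - timedelta(days=2), 1.0, 1.0, 0.65),
    Case(now - timedelta(days=30), 0.0, 0.0, 0.80),  # too old to count
]
print(alert_reason(history, Case(now, 0.5, 0.5, 0.95)))
print(alert_reason(history, Case(now, 0.5, 0.5, 0.70)))
```

Keeping the cluster threshold lower than the individual threshold lets several moderately suspicious cases raise an alert that no single one of them would.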

The subject technology, in some examples, can effectively balance the dual challenges of early detection of individual threat agents and simultaneous detection of unusual patterns of disease occurrence in a target population. Use of the subject technology can, among other benefits, assist clinicians in quickly and effectively detecting potential BTAs in order to better respond to and mitigate the effects of a possible large-scale outbreak.

FIG. 2 illustrates an example profile development methodology based on literature. FIGS. 3A-3B illustrate an example profile development methodology based on data. FIG. 4 illustrates an example synthetic patient generator logic.

FIG. 5 illustrates an example computing device 500 for generating a diagnosis for a patient. As shown, the computing device 500 includes a processing unit 505, a network interface 510, and a memory 515. The processing unit 505 includes one or more processors. The processing unit 505 may include a central processing unit (CPU), a graphics processing unit (GPU), or any other processing unit. The processing unit 505 executes computer instructions that are stored in a computer-readable medium, for example, the memory 515. The network interface 510 allows the computing device 500 to transmit and receive data in the network 590, which may include one or more of a local area network, a wide area network, a wired network, a wireless network, the Internet, a cellular network, etc. Using the network interface 510, the computing device 500 may communicate with remote computer(s) connected to the network 590, for example, a data repository 595. The memory 515 stores data and/or instructions. The memory 515 may be one or more of a cache unit, a storage unit, an internal memory unit, or an external memory unit. As illustrated, the memory 515 includes a diagnosis generating module 520 and a diagnosis tree 525.

The diagnosis generating module 520 may be implemented in software and may store instructions. The instructions, when executed, may cause the processing unit 505 to receive medical information for a patient. The medical information may be received via input device(s) (e.g., a keyboard or mouse) of the computing device 500 or via the network 590. The medical information may include multiple medical information items having different sources (e.g., medical records, physician assessments, test results, etc.). Each medical information item may include a date, a source, and a medical state. The medical state may be a symptom, a sign, a finding, or a test result. In some cases, the source is a test type (e.g., weight test) and the medical state is a test result (e.g., weighs 160 pounds). In some cases, the source is a medical record and the medical state is a fact noted in the medical record.

The diagnosis generating module 520 may include instructions which, when executed by the processing unit 505, cause the processing unit 505 to construct, in the memory 515 of the computing device 500, a diagnosis tree 525 for the patient. The diagnosis tree 525 includes a patient node 530. The children nodes of the patient node 530 are date/source nodes 535.1-2, which include a date and/or a source of medical information. The children of the date/source nodes 535.1-2 are medical state nodes 540.1-4, which include medical state information. While each parent node is illustrated as having two children nodes, in alternative implementations, each parent node may have any number of children nodes. Thus, there may be any number, not necessarily two, of date/source nodes 535, and any number, not necessarily four, of medical state nodes 540. Also, while a single diagnosis tree 525 for a single patient is illustrated, there may be multiple diagnosis trees for multiple patients.
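The three-level structure of the diagnosis tree 525 can be illustrated with a minimal Python sketch. The `Node` class, the `build_diagnosis_tree` function, and the sample items are hypothetical names and data chosen for illustration; the specification does not prescribe any particular in-memory layout, and grouping by the (date, source) pair is only one of the arrangements it permits.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Generic tree node; kind is 'patient', 'date_source', or 'medical_state'."""
    kind: str
    label: str
    children: List["Node"] = field(default_factory=list)


def build_diagnosis_tree(patient_id, items):
    """Build a patient node whose children are date/source nodes,
    each of which has medical state nodes as children.

    items is an iterable of (date, source, medical_state) tuples.
    """
    # Group medical states by their (date, source) pair (an assumption;
    # the text allows grouping by date, by source, or by both).
    grouped = defaultdict(list)
    for date, source, state in items:
        grouped[(date, source)].append(state)

    root = Node("patient", patient_id)
    for (date, source), states in grouped.items():
        ds_node = Node("date_source", f"{date} / {source}")
        ds_node.children = [Node("medical_state", s) for s in states]
        root.children.append(ds_node)
    return root


# Hypothetical sample data, not taken from any real record.
items = [
    ("2014-03-01", "physician assessment", "fever"),
    ("2014-03-01", "physician assessment", "cough"),
    ("2014-03-03", "weight test", "weighs 160 pounds"),
]
tree = build_diagnosis_tree("patient-001", items)
```

In this sketch each parent node may have any number of children, consistent with the statement that the two-child layout of FIG. 5 is only an example.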

The instructions in the diagnosis generating module 520, when executed, may cause the processing unit 505 to generate a diagnosis for the patient using the constructed diagnosis tree 525. In some examples, the processing unit 505 may compare the constructed diagnosis tree 525 to stored diagnostic information items for multiple diagnoses. The stored diagnostic information items may be stored in the data repository 595. The processing unit 505 may generate the diagnosis based on a similarity of the constructed diagnosis tree 525 to one or more of the stored diagnostic information items. The data repository 595 may be accessible to the computing device 500, including the processing unit 505, via the network 590.
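The specification does not state how similarity is measured. As one possible reading, the following Python sketch flattens the medical states of a patient into a set and ranks stored diagnostic information items by Jaccard similarity. The choice of Jaccard similarity, the function names, and the sample entries are all assumptions made for illustration and carry no clinical meaning.

```python
def jaccard(a, b):
    """Jaccard similarity of two collections: |intersection| / |union|."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_diagnoses(patient_states, stored_items):
    """Return (diagnosis, score) pairs sorted from most to least similar.

    stored_items maps a diagnosis name to its expected medical states
    (a stand-in for the stored diagnostic information items).
    """
    scores = {dx: jaccard(patient_states, states)
              for dx, states in stored_items.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical stored items and patient states, for illustration only.
stored_items = {
    "diagnosis A": {"fever", "cough", "muscle aches"},
    "diagnosis B": {"cough", "runny nose"},
}
patient_states = {"fever", "cough"}
ranking = rank_diagnoses(patient_states, stored_items)
```

Here the top-ranked entry is the candidate diagnosis; a real implementation might also weight states by date or source, which this sketch ignores.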

FIG. 6 illustrates an example process 600 for generating a diagnosis for a patient. The process 600 begins at step 610, where a computing device (e.g., computing device 500) receives medical information for a patient. The medical information includes medical information items. Each medical information item includes a date, a source, and a medical state.

In step 620, the computing device constructs, in a memory (e.g., memory 515) of the computing device, a diagnosis tree for the patient. The diagnosis tree includes a patient node (e.g., patient node 530). The patient node has first children nodes (e.g., date/source nodes 535) corresponding to the dates or the sources. The first children nodes have second children nodes (e.g., medical state nodes 540) corresponding to the medical states. The date in the medical information and/or in the first children nodes may be a medical encounter date, and the sources may include free-form audio or text provided by a user. The medical information may include a fact determined based on the free-form audio or text.

In step 630, the computing device generates a diagnosis for the patient using the constructed diagnosis tree. The generated diagnosis may be provided to the patient or to a medical professional working with the patient via a display of the computing device or via a message (e.g., email, text message, or push notification to mobile device) generated at the computing device. After step 630, the process 600 ends.

FIG. 7 illustrates an example system 700 for determining an outbreak of a medical condition in a geographic area. As shown, the system 700 includes a medical data repository 705 and a computing device 725 connected to one another via a network 720. The network 720 may include one or more of a local area network, a wide area network, a wired network, a wireless network, the Internet, a cellular network, etc.

As shown, the medical data repository 705 stores conditions 710.1-3. Each stored condition 710.k is associated with medical state(s) 715.k, which may include expected measurement(s) or range(s) of symptom(s), sign(s), finding(s) or test result(s) for persons having the condition 710.k. Using the medical data repository 705, an input condition may be associated with output medical state(s) or input medical state(s) may be associated with an output condition. The medical data repository 705 may implement any data structure for associating condition(s) 710.1-3 with medical states 715.1-3, for example, a table, a hash table, a linked list, etc. Furthermore, the medical data repository 705 is shown as including three conditions 710.1-3 and associated medical states 715.1-3. However, the medical data repository 705 may store any number of conditions and associated medical states.
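A hash-table version of the medical data repository 705 that supports lookups in both directions might look like the following Python sketch. The class name, the method names, and the rule that a condition matches when all of its expected states are observed are assumptions for illustration; the condition and state labels are placeholders.

```python
class MedicalDataRepository:
    """Associates each condition with its expected medical states
    using a dict (hash table)."""

    def __init__(self):
        self._states_by_condition = {}

    def add(self, condition, states):
        """Store the expected medical states for a condition."""
        self._states_by_condition[condition] = frozenset(states)

    def states_for(self, condition):
        """Input condition -> output medical states."""
        return self._states_by_condition.get(condition, frozenset())

    def conditions_for(self, observed_states):
        """Input medical states -> output conditions.

        A condition is returned when every one of its expected states
        appears in observed_states (an assumed matching rule).
        """
        observed = set(observed_states)
        return [cond for cond, expected in self._states_by_condition.items()
                if expected <= observed]


# Placeholder conditions and states, for illustration only.
repo = MedicalDataRepository()
repo.add("condition 710.1", {"state a", "state b"})
repo.add("condition 710.2", {"state c"})
matches = repo.conditions_for({"state a", "state b", "state d"})
```

A table or linked list, as the text also mentions, would serve equally well; the dict is used here only because it makes the condition-to-states lookup a single operation.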

As shown, the computing device 725 includes a processing unit 730, a network interface 735, and a memory 740. The processing unit 730 includes one or more processors. The processing unit 730 may include a central processing unit (CPU), a graphics processing unit (GPU), or any other processing unit. The processing unit 730 executes computer instructions that are stored in a computer-readable medium, for example, the memory 740. The network interface 735 allows the computing device 725 to transmit and receive data in the network 720. Using the network interface 735, the computing device 725 may communicate with remote computer(s) connected to the network 720, for example, the medical data repository 705. The memory 740 stores data and/or instructions. The memory 740 may be one or more of a cache unit, a storage unit, an internal memory unit, or an external memory unit. As illustrated, the memory 740 includes patient reports 745.1-3 and a medical condition outbreak detection module 760.

The medical condition outbreak detection module 760 includes instructions which, when executed by the processing unit 730, cause the processing unit 730 to receive the patient reports 745.1-3. Each patient report 745.k includes medical state(s) 750.k and a geographic area 755.k. The medical state(s) may include symptom(s), sign(s), finding(s) or test result(s). The geographic area 755.k may be, for example, a metropolitan area (e.g., the Los Angeles metropolitan area). The geographic area may correspond to a geographic area of a treatment facility generating the report (e.g., a treatment facility in the Los Angeles metropolitan area may be associated with Los Angeles) or the geographic area may correspond to a default (e.g., home or work) geographic area of the patient.

The exact home or work location of the patient may be obfuscated to protect the patient's privacy. For example, a patient report may indicate that a patient lives in the San Francisco metropolitan area, but may obfuscate that the patient's home address is 123 Main Street, Palo Alto, Calif. in order to protect the patient's privacy. In other words, the geographic area 755.k may lack a geographic location, within the identified geographic area, associated with the patient. In some cases, exact home or work location(s) of patient(s) may not be obfuscated for accessing specific demographic information if permitted by the laws of the jurisdiction(s) in which the subject technology is implemented.
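One simple way to realize this obfuscation is to discard the street address when the report is created and keep only the coarse geographic area. The Python sketch below assumes a hypothetical city-to-area lookup table and hypothetical `PatientReport` and `make_report` names; the specification does not say how the mapping is performed.

```python
from dataclasses import dataclass

# Hypothetical lookup from city to coarse geographic area.
CITY_TO_AREA = {
    "Palo Alto": "San Francisco metropolitan area",
}


@dataclass(frozen=True)
class PatientReport:
    """A report holding medical states and a coarse geographic area only."""
    medical_states: frozenset
    geographic_area: str  # no street address or city is stored


def make_report(medical_states, street, city):
    """Build a report that keeps only the coarse area.

    The street argument is accepted but deliberately never stored,
    so the exact location cannot be recovered from the report.
    """
    area = CITY_TO_AREA.get(city, "unknown area")
    return PatientReport(frozenset(medical_states), area)


report = make_report({"fever"}, "123 Main Street", "Palo Alto")
```

Because the report object has no field for the exact address, downstream outbreak detection can operate on it without ever handling the patient's home or work location.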

In some cases, each of the patient reports 745.1-3 is associated with the same geographic area. For example, all of the reports may be from the Los Angeles metropolitan area. As shown, there are three patient reports 745.1-3. However, the subject technology may be implemented with any number of patient reports 745.

The medical condition outbreak detection module 760 includes instructions which, when executed by the processing unit 730, cause the processing unit 730 to determine, based on data stored in the medical data repository 705, that the set of medical states 750 for at least a threshold number (e.g., 1000) of patient reports 745 in the same geographic area is associated with a specified condition (e.g., at least 1000 people in metropolitan Chicago have Anthrax). The processing unit 730 may use the associations of medical states 715.k with conditions 710.k, stored in the medical data repository 705, to make this determination. The specified condition may be a diagnosable disease.

Determining that the set of medical states is associated with the specified condition may include diagnosing, via the computing device 725, the patients associated with the patient reports as having the specified condition.

The medical condition outbreak detection module 760 includes instructions which, when executed by the processing unit 730, cause the processing unit 730 to determine an outbreak of the specified condition in the geographic area based on the reports of at least the threshold number of patients having the set of medical states. The medical condition outbreak detection module 760 includes instructions which, when executed by the processing unit 730, cause the processing unit 730 to provide, to a user of the computing device 725, an indication of the outbreak of the specified condition.

FIG. 8 illustrates an example process 800 for determining an outbreak of a medical condition in a geographic area. The process 800 begins at step 810, where a computing device (e.g., computing device 725) receives reports (e.g., patient reports 745) of patients having a set of medical states. Each report is associated with the same geographic area. The computing device may also receive other reports associated with different geographic areas. The computing device may receive the reports from various treatment facilities (e.g., hospitals, clinics, doctors' offices) over a network (e.g., network 720) using a secure and encrypted network communication technology.

In step 820, the computing device determines, based on data stored in a medical repository (e.g., medical data repository 705), that the set of medical states in the reports is associated with a specified condition. For example, the computing device may determine that at least a threshold number (e.g., 2000) of reports correspond to patients in the State of Rhode Island suffering from polio.

In step 830, the computing device determines an outbreak of the specified condition (e.g., polio) in the geographic area (e.g., Rhode Island) based on the reports of the patients having the set of medical states. If a large number (e.g., a number exceeding a threshold) of patients in a geographic area have a rare condition, there is likely an outbreak of the condition in the geographic area.

In step 840, the computing device provides an indication of the outbreak of the specified condition in the geographic area. The indication of the outbreak may be provided via a display unit of the computing device or via an electronic message (e.g., email, text message or push notification to a mobile device) transmitted from the computing device to a predetermined messaging address or message-receiving device.

As a result of some implementations of the subject technology, a disease outbreak in a geographic area may be detected more quickly and more efficiently and, thus, responded to more quickly and more efficiently. The response to the disease outbreak may include, for example, provision of medical supplies for treatment. After step 840, the process 800 ends.
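Steps 810 through 840 of process 800 can be sketched in Python as a count of matching reports per geographic area compared against a threshold. The function names, the subset-based matching rule, and the tiny threshold of 3 are assumptions chosen to keep the example small; the specification gives thresholds such as 1000 or 2000 and does not fix a matching rule. The condition and state labels are placeholders.

```python
from collections import Counter


def detect_outbreaks(reports, expected_states, threshold):
    """Return (area, condition, count) for each area in which at least
    `threshold` reports match a condition.

    reports: iterable of (geographic_area, medical_states) pairs.
    expected_states: dict mapping condition -> set of expected states
        (a stand-in for the medical data repository).
    """
    counts = Counter()
    for area, states in reports:
        observed = set(states)
        for condition, expected in expected_states.items():
            # Assumed rule: a report matches when it contains every
            # expected state of the condition.
            if expected <= observed:
                counts[(area, condition)] += 1
    return [(area, condition, n)
            for (area, condition), n in counts.items()
            if n >= threshold]


def format_indication(area, condition, count):
    """Build the text of an outbreak indication for display or messaging."""
    return f"Possible outbreak of {condition} in {area}: {count} matching reports"


# Placeholder data, for illustration only.
expected_states = {"condition X": {"state a", "state b"}}
reports = [
    ("Rhode Island", {"state a", "state b"}),
    ("Rhode Island", {"state a", "state b", "state c"}),
    ("Rhode Island", {"state b", "state a"}),
    ("Chicago", {"state a"}),
]
outbreaks = detect_outbreaks(reports, expected_states, threshold=3)
messages = [format_indication(*o) for o in outbreaks]
```

The resulting message strings could then be shown on a display unit or placed in an electronic message, as described for step 840; delivery itself is outside the scope of this sketch.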

FIG. 9 conceptually illustrates an electronic system 900 with which some implementations of the subject technology are implemented. For example, one or more of the data repository 595, the medical data repository 705, or the computing devices 500 and 725 may be implemented using the arrangement of the electronic system 900. The electronic system 900 can be a computer (e.g., a mobile phone, PDA), or any other sort of electronic device. Such an electronic system includes various types of computer readable media and interfaces for various other types of computer readable media. Electronic system 900 includes a bus 905, processing unit(s) 910, a system memory 915, a read-only memory 920, a permanent storage device 925, an input device interface 930, an output device interface 935, and a network interface 940.

The bus 905 collectively represents all system, peripheral, and chipset buses that communicatively connect the numerous internal devices of the electronic system 900. For instance, the bus 905 communicatively connects the processing unit(s) 910 with the read-only memory 920, the system memory 915, and the permanent storage device 925.

From these various memory units, the processing unit(s) 910 retrieves instructions to execute and data to process in order to execute the processes of the subject technology. The processing unit(s) can be a single processor or a multi-core processor in different implementations.

The read-only-memory (ROM) 920 stores static data and instructions that are needed by the processing unit(s) 910 and other modules of the electronic system. The permanent storage device 925, on the other hand, is a read-and-write memory device. This device is a non-volatile memory unit that stores instructions and data even when the electronic system 900 is off. Some implementations of the subject technology use a mass-storage device (for example a magnetic or optical disk and its corresponding disk drive) as the permanent storage device 925.

Other implementations use a removable storage device (for example a floppy disk or flash drive, and its corresponding disk drive) as the permanent storage device 925. Like the permanent storage device 925, the system memory 915 is a read-and-write memory device. However, unlike storage device 925, the system memory 915 is a volatile read-and-write memory, such as a random access memory. The system memory 915 stores some of the instructions and data that the processor needs at runtime. In some implementations, the processes of the subject technology are stored in the system memory 915, the permanent storage device 925, or the read-only memory 920. For example, the various memory units include instructions for generating a diagnosis or detecting an outbreak of a medical condition in accordance with some implementations. From these various memory units, the processing unit(s) 910 retrieves instructions to execute and data to process in order to execute the processes of some implementations.

The bus 905 also connects to the input and output device interfaces 930 and 935. The input device interface 930 enables the user to communicate information and select commands to the electronic system. Input devices used with input device interface 930 include, for example, alphanumeric keyboards and pointing devices (also called “cursor control devices”). The output device interface 935 enables, for example, the display of images generated by the electronic system 900. Output devices used with output device interface 935 include, for example, printers and display devices, for example cathode ray tubes (CRT) or liquid crystal displays (LCD). Some implementations include devices, for example a touch screen, that function as both input and output devices.

Finally, as shown in FIG. 9, bus 905 also couples electronic system 900 to a network (not shown) through a network interface 940. In this manner, the electronic system 900 can be a part of a network of computers (for example a local area network (LAN), a wide area network (WAN), or an Intranet), or a network of networks, for example the Internet. Any or all components of electronic system 900 can be used in conjunction with the subject technology.

The above-described features and applications can be implemented as software processes that are specified as a set of instructions recorded on a computer readable storage medium (also referred to as computer readable medium). When these instructions are executed by one or more processing unit(s) (e.g., one or more processors, cores of processors, or other processing units), they cause the processing unit(s) to perform the actions indicated in the instructions. Examples of computer readable media include, but are not limited to, CD-ROMs, flash drives, RAM chips, hard drives, EPROMs, etc. The computer readable media do not include carrier waves and electronic signals passing wirelessly or over wired connections.

In this specification, the term “software” is meant to include firmware residing in read-only memory or applications stored in magnetic storage or flash storage, for example, a solid-state drive, which can be read into memory for processing by a processor. Also, in some implementations, multiple software technologies can be implemented as sub-parts of a larger program while remaining distinct software technologies. In some implementations, multiple software technologies can also be implemented as separate programs. Finally, any combination of separate programs that together implement a software technology described here is within the scope of the subject technology. In some implementations, the software programs, when installed to operate on one or more electronic systems, define one or more specific machine implementations that execute and perform the operations of the software programs.

A computer program (also known as a program, software, software application, script, or code) can be written in any form of programming language, including compiled or interpreted languages, declarative or procedural languages, and it can be deployed in any form, including as a stand alone program or as a module, component, subroutine, object, or other unit suitable for use in a computing environment. A computer program may, but need not, correspond to a file in a file system. A program can be stored in a portion of a file that holds other programs or data (e.g., one or more scripts stored in a markup language document), in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, sub programs, or portions of code). A computer program can be deployed to be executed on one computer or on multiple computers that are located at one site or distributed across multiple sites and interconnected by a communication network.

The functions described above can be implemented in digital electronic circuitry, in computer software, firmware or hardware. The techniques can be implemented using one or more computer program products. Programmable processors and computers can be included in or packaged as mobile devices. The processes and logic flows can be performed by one or more programmable processors and by one or more programmable logic circuits. General and special purpose computing devices and storage devices can be interconnected through communication networks.

Some implementations include electronic components, for example microprocessors, storage and memory that store computer program instructions in a machine-readable or computer-readable medium (alternatively referred to as computer-readable storage media, machine-readable media, or machine-readable storage media). Some examples of such computer-readable media include RAM, ROM, read-only compact discs (CD-ROM), recordable compact discs (CD-R), rewritable compact discs (CD-RW), read-only digital versatile discs (e.g., DVD-ROM, dual-layer DVD-ROM), a variety of recordable/rewritable DVDs (e.g., DVD-RAM, DVD-RW, DVD+RW, etc.), flash memory (e.g., SD cards, mini-SD cards, micro-SD cards, etc.), magnetic or solid state hard drives, read-only and recordable Blu-Ray® discs, ultra density optical discs, any other optical or magnetic media, and floppy disks. The computer-readable media can store a computer program that is executable by at least one processing unit and includes sets of instructions for performing various operations. Examples of computer programs or computer code include machine code, for example as produced by a compiler, and files including higher-level code that are executed by a computer, an electronic component, or a microprocessor using an interpreter.

While the above discussion primarily refers to microprocessors or multi-core processors that execute software, some implementations are performed by one or more integrated circuits, for example application specific integrated circuits (ASICs) or field programmable gate arrays (FPGAs). In some implementations, such integrated circuits execute instructions that are stored on the circuit itself.

As used in this specification and any claims of this application, the terms “computer”, “server”, “processor”, and “memory” all refer to electronic or other technological devices. These terms exclude people or groups of people. For the purposes of the specification, the terms “display” and “displaying” mean displaying on an electronic device. As used in this specification and any claims of this application, the terms “computer readable medium” and “computer readable media” are entirely restricted to tangible, physical objects that store information in a form that is readable by a computer. These terms exclude any wireless signals, wired download signals, and any other ephemeral signals.

To provide for interaction with a user, implementations of the subject matter described in this specification can be implemented on a computer having a display device, e.g., a cathode ray tube (CRT) or liquid crystal display (LCD) monitor, for displaying information to the user and a keyboard and a pointing device, e.g., a mouse or a trackball, by which the user can provide input to the computer. Other kinds of devices can be used to provide for interaction with a user as well; for example, feedback provided to the user can be any form of sensory feedback, e.g., visual feedback, auditory feedback, or tactile feedback; and input from the user can be received in any form, including acoustic, speech, or tactile input. In addition, a computer can interact with a user by sending documents to and receiving documents from a device that is used by the user; for example, by sending web pages to a web browser on a user's client device in response to requests received from the web browser.

The subject matter described in this specification can be implemented in a computing system that includes a back end component, e.g., as a data server, or that includes a middleware component, e.g., an application server, or that includes a front end component, e.g., a client computer having a graphical user interface or a Web browser through which a user can interact with an implementation of the subject matter described in this specification, or any combination of one or more such back end, middleware, or front end components. The components of the system can be interconnected by any form or medium of digital data communication, e.g., a communication network. Examples of communication networks include a local area network (LAN) and a wide area network (WAN), an inter-network (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks).

The computing system can include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. In some aspects of the disclosed subject matter, a server transmits data (e.g., an HTML page) to a client device (e.g., for purposes of displaying data to and receiving user input from a user interacting with the client device). Data generated at the client device (e.g., a result of the user interaction) can be received from the client device at the server.

It is understood that any specific order or hierarchy of steps in the processes disclosed is an illustration of example approaches. Based upon design preferences, it is understood that the specific order or hierarchy of steps in the processes may be rearranged, or that all illustrated steps be performed. Some of the steps may be performed simultaneously. For example, in certain circumstances, multitasking and parallel processing may be advantageous. Moreover, the separation of various system components illustrated above should not be understood as requiring such separation, and it should be understood that the described program components and systems can generally be integrated together in a single software product or packaged into multiple software products.

Various modifications to these aspects will be readily apparent, and the generic principles defined herein may be applied to other aspects. Thus, the claims are not intended to be limited to the aspects shown herein, but are to be accorded the full scope consistent with the language claims, where reference to an element in the singular is not intended to mean “one and only one” unless specifically so stated, but rather “one or more.” Unless specifically stated otherwise, the term “some” refers to one or more. Pronouns in the masculine (e.g., his) include the feminine and neuter gender (e.g., her and its) and vice versa. Headings and subheadings, if any, are used for convenience only and do not limit the subject technology.

A phrase, for example, an “aspect” does not imply that the aspect is essential to the subject technology or that the aspect applies to all configurations of the subject technology. A disclosure relating to an aspect may apply to all configurations, or one or more configurations. A phrase, for example, an aspect may refer to one or more aspects and vice versa. A phrase, for example, a “configuration” does not imply that such configuration is essential to the subject technology or that such configuration applies to all configurations of the subject technology. A disclosure relating to a configuration may apply to all configurations, or one or more configurations. A phrase, for example, a configuration may refer to one or more configurations and vice versa.

1-21. (canceled)
22. A method comprising receiving, at a computing device, reports of patients having a set of medical states, each report being associated with a same geographic area; determining, based on data stored in a medical data repository, that the set of medical states is associated with a specified condition; determining an outbreak of the specified condition in the geographic area based on the reports of the patients having the set of medical states; and providing an indication of the outbreak of the specified condition in the geographic area.
23. The method of claim 22, wherein the set of medical states comprises a set of symptoms, signs, findings, or test results.
24. The method of claim 22, wherein the specified condition comprises a diagnosable disease.
25. The method of claim 22, wherein determining that the set of medical states is associated with the specified condition comprises diagnosing, via the computing device, the patients as having the specified condition.
26. The method of claim 22, wherein the geographic area comprises a metropolitan area, corresponds to a geographic location of treatment facilities generating the reports, or corresponds to a default geographic location of the patients, and wherein the default geographic location of each patient is obfuscated.
27. (canceled)
28. (canceled)
29. The method of claim 22, wherein each report of each patient having the set of medical states lacks a geographic location, within the geographic area, associated with the corresponding patient.
30. The method of claim 22, wherein providing the indication of the outbreak of the specified condition in the geographic area comprises providing, via a display unit of the computing device, the indication of the outbreak or providing, via an electronic message transmitted from the computing device, the indication of the outbreak.
31. (canceled)
32. A non-transitory computer-readable medium comprising instructions which, when executed by one or more computers, cause the one or more computers to: receive reports of patients having a set of medical states, each report being associated with a same geographic area; determine, based on data stored in a medical data repository, that the set of medical states is associated with a specified condition; determine an outbreak of the specified condition in the geographic area based on the reports of the patients having the set of medical states; and provide an indication of the outbreak of the specified condition in the geographic area.
33. The computer-readable medium of claim 32, wherein the set of medical states comprises a set of symptoms, signs, findings, or test results.
34. The computer-readable medium of claim 32, wherein the specified condition comprises a diagnosable disease.
35. The computer-readable medium of claim 32, wherein the instructions to determine that the set of medical states is associated with the specified condition comprises instructions which, when executed by the one or more computers, cause the one or more computers to diagnose the patients as having the specified condition.
36. The computer-readable medium of claim 32, wherein the geographic area comprises a metropolitan area, corresponds to a geographic location of treatment facilities generating the reports, or corresponds to a default geographic location of the patients, and wherein the default geographic location of each patient is obfuscated.
37. (canceled)
38. (canceled)
39. The computer-readable medium of claim 32, wherein each report of each patient having the set of medical states lacks a geographic location, within the geographic area, associated with the corresponding patient.
40. The computer-readable medium of claim 32, wherein the instructions to provide the indication of the outbreak of the specified condition in the geographic area comprise instructions which, when executed by the one or more computers, cause the one or more computers to provide, via a display unit of the one or more computers, the indication of the outbreak or provide, via an electronic message transmitted from the one or more processors, the indication of the outbreak.
41. (canceled)
42. A system comprising: one or more processors; and a memory comprising instructions which, when executed by one or more processors, cause the one or more processors to: receive reports of patients having a set of medical states, each report being associated with a same geographic area; determine, based on data stored in a medical data repository, that the set of medical states is associated with a specified condition; determine an outbreak of the specified condition in the geographic area based on the reports of the patients having the set of medical states; and provide an indication of the outbreak of the specified condition in the geographic area.
43. The system of claim 42, wherein the set of medical states comprises a set of symptoms, signs, findings, or test results.
44. (canceled)
45. The system of claim 42, wherein the instructions to determine that the set of medical states is associated with the specified condition comprises instructions which, when executed by the one or more processors, cause the one or more processors to diagnose the patients as having the specified condition.
46. The system of claim 42, wherein the geographic area comprises a metropolitan area, corresponds to a geographic location of treatment facilities generating the reports, or corresponds to a default geographic location of the patients, and wherein the default geographic location of each patient is obfuscated.
47. (canceled)
48. (canceled)
49. The system of claim 42, wherein each report of each patient having the set of medical states lacks a geographic location, within the geographic area, associated with the corresponding patient.
50. The system of claim 42, wherein the instructions to provide the indication of the outbreak of the specified condition in the geographic area comprise instructions which, when executed by the one or more processors, cause the one or more processors to provide, via a display unit, the indication of the outbreak or provide, via an electronic message transmitted from the one or more processors, the indication of the outbreak.
51. (canceled)